Bitonic sorter

Bitonic sorter

Bitonic sort network with eight inputs.
Class Sorting algorithm
Data structure Array
Worst case performance O(log^2(n)) parallel time
Best case performance O(log^2(n)) parallel time
Average case performance O(log^2(n)) parallel time
Worst case space complexity O(n   log^2(n)) comparators

Bitonic mergesort is a parallel algorithm for sorting. It is also used as a construction method for building a sorting network. The algorithm was devised by Ken Batcher. The resulting sorting networks consist of O(n log2(n)) comparators and have a delay of O(log2(n)), where n is the number of items to be sorted.[1]

A sorted sequence is a monotonically non-decreasing (or non-increasing) sequence. A bitonic sequence is a sequence with x_0 \leqq \ldots \leqq x_k \geqq \ldots \geqq x_{n-1} for some k, 0 \leqq k < n, or a circular shift of such a sequence.

Contents

How the algorithm works

The following is a bitonic sorting network with 16 inputs:

The 16 numbers enter at the inputs at the left end, slide along each of the 16 horizontal wires, and exit at the outputs at the right end. The network is designed to sort the elements, with the largest number at the bottom.

The arrows are comparators. Whenever two numbers reach the two ends of an arrow, they are compared to ensure that the arrow points toward the larger number. If they are out of order, they are swapped. The colored boxes are just for illustration and have no effect on the algorithm.

Every red box has the same structure: each input in the top half is compared to the corresponding input in the bottom half, with all arrows pointing down (dark red) or all up (light red). If the inputs happen to form a bitonic sequence, then the output will form two bitonic sequences. The top half of the output will be bitonic, and the bottom half will be bitonic, with every element of the top half less than or equal to every element of the bottom half (for dark red) or vice versa (for light red). This theorem is not obvious, but can be verified by carefully considering all the cases of how the various inputs might compare, using the zero-one principle.

The red boxes combine to form blue and green boxes. Every such box has the same structure: a red box is applied to the entire input sequence, then to each half of the result, then to each half of each of those results, and so on. All arrows point down (blue) or all point up (green). This structure is known as a butterfly network. If the input to this box happens to be bitonic, then the output will be completely sorted in increasing order (blue) or decreasing order (green). This property is obvious. If a number enters the blue or green box, then the first red box will sort it into the correct half of the list. It will then pass through a smaller red box that sorts it into the correct quarter of the list within that half. This continues until it is sorted into exactly the correct position. Therefore, the output of the green or blue box will be completely sorted.

The green and blue boxes combine to form the entire sorting network. For any arbitrary sequence of inputs, it will sort them correctly, with the largest at the bottom. This property is obvious. The output of each green or blue box will be a sorted sequence, so the output of each pair of adjacent lists will be bitonic, because the top one is blue and the bottom one is green. Each column of blue and green boxes takes N sorted sequences and concatenates them in pairs to form N/2 bitonic sequences, which are then sorted by the boxes in that column to form N/2 sorted sequences. This process starts with each input considered to be a sorted list of one element, and continues through all the columns until the last merges them into a single, sorted list. Because the last stage was blue, this final list will have the largest element at the bottom.

Each green box performs the same operation as a blue box, but with the sort in the opposite direction. So, each green box could be replaced by a blue box followed by a crossover where all the wires move to the opposite position. This would allow all the arrows to point the same direction, but would prevent the horizontal lines from being straight. However, a similar crossover could be placed to the right of the bottom half of the outputs from any red block, and the sort would still work correctly, because the reverse of a bitonic sequence is still bitonic. If a red box then has a crossover before and after it, it can be rearranged internally so the two crossovers cancel, so the wires become straight again. Therefore, the following diagram is equivalent to the one above, where each green box has become a blue plus a crossover, and each orange box is a red box that absorbed two such crossovers:

The arrowheads are not drawn, because every comparator sorts in the same direction. The blue and red blocks perform the same operations as before. The orange blocks are equivalent to red blocks where the sequence order is reversed for the bottom half of its inputs and the bottom half of its outputs. This is the most common representation of a bitonic sorting network.

Example code

The following is an implementation of the bitonic mergesort sorting algorithm in Python. The input is a boolean value up, and a list x of length a power of 2. The output is a sorted list that is ascending if up is true, and decreasing otherwise.

def bitonic_sort(up,x):
    if len(x)<=1:
        return x
    else: 
        first = bitonic_sort(True,x[:len(x)/2])
        second = bitonic_sort(False,x[len(x)/2:])
        return bitonic_merge(up,first+second)
 
def bitonic_merge(up,x): 
    # assume input x is bitonic, and sorted list is returned 
    if len(x) == 1:
        return x
    else:
        bitonic_compare(up,x)
        first = bitonic_merge(up,x[:len(x)/2])
        second = bitonic_merge(up,x[len(x)/2:])
        return first + second
 
def bitonic_compare(up,x):
    dist = len(x)/2
    for i in range(dist):  
        if (x[i] > x[i+dist]) == up:
            x[i], x[i+dist] = x[i+dist], x[i] #swap
 
>>> bitonic_sort(True,[10,30,11,20,4,330,21,110])
[4, 10, 11, 20, 21, 30, 110, 330]
>>> bitonic_sort(False,[10,30,11,20,4,330,21,110])
[330, 110, 30, 21, 20, 11, 10, 4]

The implementation in Java:

import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
 
public class BitonicMergeSort{
 
    static <T extends Comparable<? super T>> List<T> bitonic_sort(boolean up, List<T> list) {
        if (list.size() <= 1)
            return list;
        List<T> first = new ArrayList<T>(list.subList(0, list.size() / 2));
        first = bitonic_sort(up, first);
        List<T> second = new ArrayList<T>(list.subList(list.size() / 2, list.size()));
        second = bitonic_sort(!up, second);
        first.addAll(second);
        return bitonic_merge(up, first);
    }
 
    static <T extends Comparable<? super T>> List<T> bitonic_merge(boolean up, List<T> list) {
        if (list.size() == 1)
            return list;
        bitonic_compare(up, list);
        List<T> first = new ArrayList<T>(list.subList(0, list.size() / 2));
        first = bitonic_merge(up, first);
        List<T> second = new ArrayList<T>(list.subList(list.size() / 2, list.size()));
        second = bitonic_merge(up, second);
        first.addAll(second);
        return first;
    }
 
    static <T extends Comparable<? super T>> List<T> bitonic_compare(boolean up, List<T> list) {
        int dist = list.size() / 2;
        for (int i = 0; i < dist; i++) {
            int result = list.get(i).compareTo(list.get(i + dist));
            if (up && result > 0 || !up && result < 0) {
                T temp = list.get(i);
                list.set(i, list.get(i + dist));
                list.set(i + dist, temp);
            }
        }
        return list;
    }
 
    public static void main(String[] args) {
        List<Integer> original = Arrays.asList(10, 30, 11, 20, 4, 330, 21, 110);
        System.out.printf("Original List:\n %s \n", original);
 
        List<Integer> ascSorted = bitonic_sort(true, original);
        System.out.printf("\nSorted List (Ascending order):\n %s \n", ascSorted);
 
        List<Integer> descSorted = bitonic_sort(false, original);
        System.out.printf("\nSorted List (Descending order):\n %s \n", descSorted);
    }
}

The below can be used on a parallel architecture to sort efficiently on multiple processors. The following code is very simplified - it assumes that 2^k processors (where k is an integer) are being used and the list is 2^n elements long. The code is written in C using MPI. Comments are included for clarity.

...
int main(int argc, char **argv){
 
	/* Set up MPI */
	MPI_Init( &argc, &argv );
	MPI_Comm_rank( MPI_COMM_WORLD, &my_rank ); /* Get My Process Rank */
	MPI_Comm_size( MPI_COMM_WORLD, &p ); /* How many processes are being used */ 
 
	/* We will assume that the array (arr) of length SIZE, consists of random integers. */ 
 
 
	/* The master process must distribute the list among all available processors.  Each processor must receive the list  */
	/*  and sort it in the proper order */
	if( my_rank == 0 ){
 
		int i;
		int n_p = SIZE / p; /* Number of elements per processor */
 
		/* Store all needed local quantities for the master process */
		localSize = n_p;
		localUp = 1;
		localArr = malloc( n_p * sizeof( int ) );
 
		/* Distribute the array portions amongst all of the available processors */
		/* Alternating the sort orders */
		int up, currentIndex;
		for(i = 1; i <= p; i++){
			up = (i + 1) % 2; //Must be 0 for even numbered sets (odds) and 1 for odds
			currentIndex = i * n_p; 
 
			MPI_Send( &up, 1, MPI_INT, i, tag, MPI_COMM_WORLD );
			MPI_Send( &n_p, 1, MPI_INT, i, tag, MPI_COMM_WORLD );
			MPI_Send( &(arr[currentIndex]), n_p, MPI_INT, i, tag, MPI_COMM_WORLD );
		}//end for : sending to all processors
 
		/* Load the first n/p elements into the master's local array */
		/* (This was done first to allow the other processors to begin sorting) */
		for(i = 0; i < n_p; i++){
			localArr[i] = arr[i];
		}//end for : Load the master's local array
 
		/* Have the master processor sort the local portion of the list */
		bitonic_sort( 1, localArr, 0, n_p - 1 );
 
	}else{
			/* Receive the assignment data from the master process */
			int n, up;
			MPI_Recv( &up, 1, MPI_INT, source, tag, MPI_COMM_WORLD, &status );
			MPI_Recv( &n, 1, MPI_INT, source, tag, MPI_COMM_WORLD, &status );
			int arr[n];
			MPI_Recv( &arr, n, MPI_INT, source, tag, MPI_COMM_WORLD, &status );
 
			/* Store all values into the local variables */
			localSize = n;
			localUp = up;
			localArr = malloc( n * sizeof( int ) );
			int i;
			for(i = 0; i < n; i++){
				localArr[i] = arr[i];
			}//end for : load local array
 
			/* Perform the Assigned sorting task*/
			bitonic_sort( up, localArr, 0, n - 1 );
 
	}//end if-else : sort initial lists
 
 
 
	/* All processors must wait for the other to complete their local sorts before */
	/* continuing the sort */
	MPI_Barrier( MPI_COMM_WORLD );
 
 
	/* Begin inter-processor merging operations */
	int i,j;
	for( i = 0; i < log_p + 1 ; i++ ){
		for( j = i; j > 0; j-- ){
 
			/* compare and exchange elements with the processor obtained by inverting jth bit of id*/
			int id[p];
			getBinary( my_rank, id, p);
			int ithBit = id[i];	
			flipBit( id, j ); /* Flip the jth bit of id to find processor to exchange with*/
			int partnerId = getDecimal( id, p);
 
 
			if( ithBit == 0 ){
				if( my_rank > partnerId ){
					swap_with( partnerId, 0);
				}else if( my_rank < partnerId ){
					swap_with( partnerId, 1);
				}else{
					bitonic_merge( 1, localArr, 0, localSize - 1 );
				}//end if-else
			}else{
				if( my_rank > partnerId ){
					swap_with( partnerId, 1); 
				}else if( my_rank < partnerId ){
					swap_with( partnerId, 0);
				}else{
					bitonic_merge( 0, localArr, 0, localSize - 1 );
				}//end if-else
			}//end if-else	
 
		}//end for
	}//end for 
 
 
	/* Perform 1 final local merge on all processors */
	bitonic_merge( 1, localArr, 0, localSize - 1 );
 
 
	MPI_Finalize(); /* Perform Cleanup tasks */
 
	...
 
}//end main method

References

  1. ^ http://www.iti.fh-flensburg.de/lang/algorithmen/sortieren/bitonic/oddn.htm

External links

-